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Belnap-Dunn Logic and Query Answering 
in Inconsistent Databases with Null Values 
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Abstract 


This paper concerns an expansion of first-order Belnap-Dunn logic, 
called BD?”, and an application of this logic in the area of relational 
database theory. The notion of a relational database, the notion of a 
query applicable to a relational database, and several notions of an 
answer to a query with respect to a relational database are considered 
from the perspective of this logic, taking into account that a database 
may be an inconsistent database and/or a database with null values. 
The chosen perspective enables among other things the definition of 
a notion of a consistent answer to a query with respect to a possibly 
inconsistent database without resort to database repairs. For each of 
the notions of an answer considered, being an answer to a query with 
respect to a database of the kind considered is decidable. 


Keywords: relational database, inconsistent database, null value, 
consistent query answering, Belnap-Dunn logic, indeterminate value. 


1 Introduction 


In the area of relational database theory, it is quite common since the 1980s 
to take the view that a database is a set of formulas of first-order classical 
logic, a query is a formula of first-order classical logic, and query answering 
amounts to proving that a formula is a logical consequence of a set of formulas 
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in first-order classical logic (see e.g. [11, 26]). The use of first-order classical 
logic makes it difficult to take into account the possibility that a database is 
an inconsistent database and the possibility that a database is a database 
with null values. 


In the logical view of relational databases, an inconsistent relational 
database is an inconsistent set of formulas in the sense that there exists a 
formula such that both that formula and its negation are logical consequences 
of the set of formulas. Because in classical logic every formula is a logical 
consequence of each two formulas of which one is the negation of the other, 
it becomes difficult to take the possibility that a database is inconsistent 
properly into account. A logic in which not every formula is a logical 
consequence of each two formulas of which one is the negation of the other 
is called a paraconsistent logic. 


The view on null values in relational databases taken in this paper 
can be described as follows: (a) in relational databases with null values, 
a single dummy value, called the null value, is used for values that are 
indeterminate, (b) a value that is indeterminate is a value that is either 
unknown or nonexistent, and (c) independent of whether it is unknown or 
nonexistent, no meaningful answer can be given to the question whether the 
null value and whatever value, including the null value itself, are the same. 
Several variations on this view on null values in relational databases have 
been studied (see e.g. [10, 14, 30]). However, those variations are less basic 
and most of them give rise to similar problems in a logical view of relational 
databases with null values. 


In the logical view of relational databases, a relational database with null 
values is an incomplete set of formulas in the sense that there exists a formula 
such that neither that formula nor its negation is a logical consequence of the 
set of formulas. Because in classical logic, for each two formulas of which one 
is the negation of the other, one or the other is a logical consequence of every 
set of formulas, it becomes difficult to take the possibility that null values 
occur in a database properly into account. A logic in which not, for each 
two formulas of which one is the negation of the other, one or the other is a 
logical consequence of every set of formulas is called a paracomplete logic. 


In [21], the notion of a relational database, the notion of a query 
applicable to a relational database, and several notions of an answer to a query 
with respect to a relational database are considered from the perspective of 
LPQ>?*, an expansion of the first-order version of Priest’s logic of paradox 
known as LPQ [25]. The possibility that a database is an inconsistent 
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database is taken into account, but the possibility that a database is a 
database with null values is not taken into account. The reason for this is 
that LPQ>°* is paraconsistent, but not paracomplete. There are many other 
paraconsistent logics. The choice of LPQ°* stems from the fact that its 
connectives and quantifiers are all familiar from classical logic and its logical 
consequence relation is very closely connected to the one of classical logic. 


In [22], BD>*, an expansion of first-order Belnap-Dunn logic [1] with 
classical connectives, is introduced and studied with the emphasis on (a) the 
connection between the logical consequence relations of BD°* and the 
version of classical logic with the same connectives and quantifiers and 
(b) the definability in BD°* of interesting non-classical connectives added to 
Belnap-Dunn logic in its expansions that have been studied earlier. BD°* 
is both paraconsistent and paracomplete. Like for LPQ>°*, it holds for 
BD-* that its connectives and quantifiers are all familiar from classical logic 
and its logical consequence relation is very closely connected to the one of 
classical logic. 


An appendix of [22] goes briefly into an minor variation of BD°* that 
covers terms with an indeterminate value. This variation, called BD?", is 
also paraconsistent and paracomplete. Moreover, its treatment of equations 
corresponds to the above-mentioned thought that no meaningful answer 
can be given to the question whether the null value and whatever value, 
including the null value itself, are the same value. All this means that 
both the possibility that a database is an inconsistent database and the 
possibility that a database is a database with null values can be properly 
taken into account when the notions considered in [21] are considered from 
the perspective of BD?”. 


This is taken up in the current paper. In order to make this paper 
self-contained, the language and logical consequence relation of BD°* as 
well as a sequent calculus proof system for BD°* are presented. BD?* 
is introduced as a minor variation of BD°*. The notion of a relational 
database, the notion of a query applicable to a relational database, and 
several notions of an answer to a query with respect to a relational database 
are defined in the setting of BD?", taking into account that a database may 
be an inconsistent database and/or a database with null values. Like in [21], 
the definitions concerned are based on those given in [26]. Two notions 
of a consistent answer to a query with respect to a possibly inconsistent 
relational database are introduced. Like in [21], one of them is reminiscent 
of the notion of a consistent answer from [6] and the other is reminiscent of 
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the notion of a consistent answer from [2]. 

The structure of this paper is as follows. In Sections 2, 3, and 4, 
the language of BD°*, the logical consequence relation of BD°*, and a 
sequent calculus proof system for BD>* are presented. In Section 5, BD?* 
is introduced as a minor variation of BD°*. In Sections 6 and 7, relational 
databases and query answering with respect to a database that may be an 
inconsistent database and/or a database with null values are considered from 
the perspective of BD°*. In Section 8, some variations of the view on null 
values in relational databases taken in this paper are briefly discussed. In 
Section 9, some concluding remarks are made. Parts of the sections in which 
BD>* is introduced overlap with parts of [22]. 


2 The Language of BD°* 


In this section the language of the logic BD°* is described. First the notions 
of a signature and an alphabet are introduced and then the terms and 
formulas of BD°* are defined for a fixed but arbitrary signature. Moreover, 
some relevant notational conventions are presented and some remarks about 
free variables and substitution are made. In coming sections, the logical 
consequence relation of BD°* and a proof system for BD>* are presented 
for a fixed but arbitrary signature. 


Signatures and Alphabets 


It is assumed that the following has been given: (a) a countably infinite 
set Var of variables, (b) for each n € N, a countably infinite set Func, of 
function symbols of arity n, and, (c) for each n € N, a countably infinite set 
Pred, of predicate symbols of arity n. It is also assumed that all these sets 
are mutually disjoint and disjoint from the set {=,7,/A,V,D,V, 4}. 

The symbols from U{ Func, | n € N} UU {Pred | n € N} are known as 
non-logical symbols. Function symbols of arity 0 are also known as constant 
symbols and predicate symbols of arity 0 are also known as proposition 
symbols. 

A signature 3 is a subset of LJ{Func, | n € N} UU{Pred, | n € N}. 
We write Func,(2’) and Pred,(5’), where X’ is a signature and n € N, for 
the sets 3’ Func, and 2M Pred, respectively. 

The language of BD°* will be defined for a fixed but arbitrary sig- 
nature ¥’. This language will be called the language of BD°* over »' or 
shortly the language of BD°*(’). The corresponding logical consequence 
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relation will be called the logical consequence relation of BD°*(2’) and a 
corresponding proof system will be called a proof system for BD°?*(2’). 

The alphabet of the language of BD°*(2’) consists of the following 
symbols: 


e the variables from Var ; 
e the non-logical symbols from » ; 
e the logical symbols of BD>*, to wit: 


— the equality symbol =; 

— the falsity connective F; 

— the negation connective 7; 

— the conjunction connective /; 
— the disjunction connective V ; 
— the implication connective Dd; 


— the universal quantifier V ; 


— the existential quantifier 4. 


Terms and Formulas 


The language of BD°*(’) consists of terms and formulas. They are con- 
structed from the symbols in the alphabet of the language of BD°*(2’) 
according to the formation rules given below. 

The set of all terms of BD°* (2), written Jerm(.2’), is inductively defined 
by the following formation rules: 


1. if e € Var, then x € Term(~’); 
2. if c € Funco(), then c € Term(2’); 


3. if f € Funcensi(%) and ty,...,tr41 € Term(%), then f(ty,...,tn4i1) € 
Term(s’). 


The set of all closed terms of BD°*(2’) is the subset of Jerm(2’) that can 
be formed by applying formation rules 2 and 3 only. 

The set of all formulas of BD°*(2’), written Form(2), is inductively 
defined by the following formation rules: 


1. if p € Predo(2), then p € Form(s); 
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2. if P € Predn4i(%) and ty,...,tn4i € Term(), then P(ti,...,tn41) € 
Form(2'); 


3. if t1,t2 € Term(2), then ty = te € Form(s); 

4. F € Form(2); 

5. if A € Form(2), then =A € Form(2); 

6. if Ay, Ag € Form(’), then A; A Ao, Ai V Ag, Ai D Ag € Form(); 


7. if x € Var and A € Jorm(), then Va. A, dre A € Form(2). 


The set Atom(2) of all atomic formulas of BD°*(2’) is the subset of Form(2’) 
can be formed by applying formation rules 1-3 only. 

We write e, = €2, where e; and eg are terms from Jerm( 2’) or formulas 
from Jorm(.’), to indicate that e; is syntactically equal to e2. 

In the coming sections, we will write CL°* for the version of classical 
logic that has the same language as BD°*. 


Notational Conventions 


The following will sometimes be used without mentioning (with or without 
decoration): x as a meta-variable ranging over all variables from Var, t as a 
meta-variable ranging over all terms from Jerm(2’), A as a meta-variable 
ranging over all formulas from Jorm(2’), and I’ as a meta-variable ranging 
over all sets of formulas from Jorm(.’). 

The string representation of terms and formulas suggested by the 
formation rules given above can lead to syntactic ambiguities. Parentheses 
are used to avoid such ambiguities. The need to use parentheses is reduced 
by ranking the precedence of the logical connectives =, A, V, D>. The 
enumeration presents this order from the highest precedence to the lowest 
precedence. Moreover, the scope of the quantifiers extends as far as possible 
to the right and Vz, ----Vap-A and 3x, ----dx, +A are usually written as 
Va1,...,% 2A and 4r1,...,2p « A, respectively. 


Free Variables and Substitution 


Free variables of a term or formula and substitution for variables in a term 
or formula are defined in the usual way. 

Let x be a variable from Var, t be a term from Jerm(2’), and e be a 
term from Jerm(2’) or a formula from Jorm(2’). Then we write [x := tle for 
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the result of substituting the term t for the free occurrences of the variable x 
in e, avoiding (by means of renaming of bound variables) free variables 
becoming bound in ft. 


3 Truth and Logical Consequence in BD>"(.’) 


In this section, the truth value of formulas of BD°*(%’) and the logical 
consequence relation on sets of formulas of BD°* (7) will be defined. This 
will be done using the logical matrix of BD°*(2’). 

First, the logical matrix of BD>*(3’) is defined. Next, the truth value of 
formulas of BD>*(2’) and the logical consequence relation on sets of formulas 
of BD°*(2’) are defined. The truth value of formulas is defined with respect 
to (a) a structure consisting of a set of values and an interpretation of each 
non-logical symbol and the equality symbol and (b) an assignment of values 
from that set of values to the variables. Structures and assignments are 
introduced before the definition in question. 


Matrix 


The interpretation of the logical symbols of BD°*(2’), with the exception of 
the equality symbol, is given by means of a logical matrix. 

In the definition of this matrix, t (true), f (false), b (both true and 
false), and n (neither true nor false) are taken as truth values. Moreover, 
use is made of the partial order < on the set {t,f,b,n} in which f is the 
least element, t is the greatest element, and b and n are incomparable. We 
write inf V and sup V, where V C {t,f,b,n}, for the greatest lower bound 
and least upper bound, respectively, of V with respect to <. 

The matrix of BD°*(2’) is the triple (V,D,O), where: 


e V = {t,f,b,n}; 
e D={t,b}; 
e O consists of the following constant and functions: 
Fev definedby F = f, 
t ifa=f 


=:VY—>YV definedby =(a) = fute=* 


a otherwise , 


166 C.A. Middelburg 


A:VxVoyV defined by A(a1, a2) inf{a1, a2}, 
ViVxVoV defined by V(a,,a2) =  sup{ay,ag}, 


5:VxVovV defined by 5(a1,a2) = { é tt, 2 
ag otherwise , 

¥:P(V) \ {0} + V defined by VV) = infV,? 

4: P(V)\ {0} + V defined by A(V) = supV, 


where a, a1, and a2 range over all truth values from V and V ranges 
over all non-empty subsets of V. 


V is the set of truth values of BD°*(’), D is the set of designated truth 
values of BD?*(X), and F, 5, A, V, 5, V, and J are the truth functions 
that are the interpretations of the logical symbols F, =, A, V, D, V, and J, 
respectively. The idea behind the designated truth values is that a formula 
is valid if its truth value with respect to all structures and assignments in 
those structures (both defined below) is a designated truth value. 

The set of non-designated truth values of BD?*(X’), written D, is the 
set V\ D. 


Structures 


The interpretation of the non-logical symbols of BD°*(2’) and the equality 
symbol is given by means of structures. 
A structure A of BD?*(5’) is a pair (U4, Z4), where: 


e UA is aset, called the domain of A, such that UA ¢ 0 and UANY = 0; 
e Z* consists of: 


— an element c € UA for every ¢ € Funco(S); 


n+1 times 
OO 
— a function f* :UA x... x UA 4 UA for every f € Funcn41() 
and n € N; 
— an element p“ € V for every p € Predo(); 
n+1 times 
a 
— a function PA : UA x... x UA > V for every P € Predn41() 


and n € N; 


2We write P(S), where S is a set, for the powerset of S. 
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—a function =“ : UA x UA > V where, for all dj,d. € UA, 


Instead of w“ we write w when it is clear from the context that the inter- 
pretation of symbol w in structure A is meant. 


Assignments 


The interpretation of the variables of BD°*(2’) is given by means of assign- 
ments. 

Let A be a structure of BD°*(2’). Then an assignment in A is a 
function a: Var + UA. For every assignment a in A, variable x € Var, and 
element d € U“, we write a(x > d) for the assignment a’ in A such that 
a’ (a) =d and a’(y) = a(y) ify # a. 


Valuations and Models 


Let A be a structure of BD°*(2’), and let a be an assignment in A. Then 
the valuation of Term(5’) in structure A under assignment a is the function 
tval& : Term() + UA that maps each term t to the element of U4 that is 
the value of t in A under assignment a. Similarly, the valuation of Form(.) 
in structure A under assignment a is the function fual& : Form(Z) > V that 
maps each formula A to the element of V that is the truth value of A in A 
under assignment a. We write [aa and [4]4 for tvalA(t) and fual&(A), 
respectively. 

These valuation functions are inductively defined in Table 1. In this 
table, x is a meta-variable ranging over all variables from Var, c is a meta- 
variable ranging over all function symbols from Funco(2’), f is a meta- 
variable ranging over all function symbols from Funcy+1(2’), p is a meta- 
variable ranging over all predicate symbols from Predo(5’), P is a meta- 
variable ranging over all predicate symbols from Predy+1(2’), t1, .--, tr44 
are meta-variables ranging over all terms from Jerm(5’), and A, Ai, and A» 
are meta-variables ranging over all formulas from Jorm(2’). 

The following theorem is a decidability result concerning valuations of 
formulas in structures with a finite domain. 


Proposition 1 Let A be a structure of BD?*(S’) such that U% is finite, 
and let a be an assignment in A. Then, it is decidable whether, for a formula 
A€ Form(), [AJA € D. 
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Table 1: Valuations of terms and formulas of BD>*(2’) 


Ze = az), 
a = cA, 
[i Gi tee S.-J ACAI os ele) 
pl Ss as 
PCat = PAG <alhwle ys 
n=b1 = = “dak lel); 
[Fl 
[-4j2 = A([4]8), 
[ArA Ae = A(LA?e, LA2l2) . 
[ArV 4.J@ = V(LA2e, 42l2) . 
[Ai > Ale = S5([Ai]2, 404), 
[Ve A]? = VTAlesa 1d UA), 
[Ax . A] = A({[Alee+sa | d € UA}) , 


Proof: This is easy to prove by induction on the structure of A. 


Below, the notion of a model of a set of formulas of BD°*(2’) is defined 
in terms of valuations. 

Let I’ be a set of formulas from Aorm(.2’), and let A be a structure of 
BD°?*(2). Then A is a model of I’ iff for all assignments a in A, for all 
AeT, [A]’ €D. 


Logical Consequence 


Given the valuations of terms and formulas of BD°*(2’), it is easy to make 
precise when in BD>*(2’) a set of formulas is a logical consequence of another 
set of formulas. 

Let I and A be sets of formulas from Jorm(2’). Then A is a logical 
consequence of I’, written I F A, iff for all structures A of BD°*(2’), for 
all assignments a in A, if [44 € D for all Ac I’, then [A714 € D for some 
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A'€ A. We write I ¥ A to indicate that it is not the case that [F A. 
The two properties concerning the logical consequence relation of 
BD°*(2’) mentioned below are proved in [22]. 
The logical consequence relation F of BD°*(2’) is such that 


TE A,A,> Ap iff A,,l— A, Ag 


for all I’, A C Form(2’) and Aj, Az € Form(’) and moreover there exists 
a logic £ with the same language as BD°*(2’) and a logical consequence 
relation F’ such that: 


e FCF; 


e the matrix (V’,D’, O’) of £ is such that V’ = {t,f}, D’ = {t}, and the 
interpretation of — in O’ is as follows: 


=a) ={ t ifa=f 


f ifa=t 
for allae VY’. 
This means that BD°*(2) is >-coherent with classical logic in the sense 
of [3]. 

There exist A, A’ € Form(2’) such that A,=A¥ A’. Because BD°*(2’) 
is also —-coherent with classical logic, this means that BD°*(2’) is paracon- 
sistent in the sense of [3]. 

There exist a ! C Aorm(2’) and A, A’ € Form(2’) such that ', AF A’ 


and I',=A F A’, but IF A’. Because BD°*(2’) is also >-coherent with 
classical logic, this means that BD°'(2’) is paracomplete in the sense of [3]. 


Abbreviations 


In what follows, the following abbreviations will be used: 


t; At. stands for 7(t, = to), 
T stands for —F, 
A; — Ag stands for (A, D Ag) A (7=A2 D 7A), 
AA stands for =(A > F) (designatedness), 
oA stands for =(A(A A —A)) (consistency), 
xA stands for A(A V =A) (determinacy). 


170 C.A. Middelburg 


It follows from the definitions concerned that: 


[A4]4 { if [Ale € {t, b} 
otherwise, 
[oA]? 


if [AJ* € {t, f, n} 
[Ala = 


t 

f 

t 

f otherwise, 

t if [AJA € {t,f, b} 

f otherwise. 
This means that the abbreviations AA, oA, and *A correspond to formulas 
of studied expansions of Belnap-Dunn logic whose connectives include A, o 
or x. The connective A is for example found in the expansion of Belnap- 
Dunn logic known as BDA [27]. The connectives o and * have for example 
been studied in the setting of Belnap-Dunn logic in [8]. The connective o 
is also found in the expansion of the first-order version of Priest’s logic of 
paradox [25] known as LP® [24]. The connective x is the counterpart of o in 
the setting of Kleene’s strong three-valued logic [15, Section 64]. 

Moreover, notice that: 


II 
——s 


t if [AJA € {t, f} 


A _ 
De { f otherwise. 


4 A Proof System for BD°*(,’) 


In this section, a sequent calculus proof system for BD°*(2’) is presented. 
This means that the inference rules have sequents as premises and conclusions. 
First, the notion of a sequent is introduced. Then, the inference rules of 
the proof system of BD°*(2’) are presented. After that, the notion of a 
derivation of a sequent from a set of sequents and the notion of a proof of a 
sequent are introduced. Extensions of the proof system of BD°*(2’) which 
can serve as proof systems for closely related logics are also described. 


Sequents 


In the sequent calculus proof system for BD°*(2’), a sequent is an expression 
of the form I => A, where I and A are finite sets of formulas from Form(.’). 
We write I’, I” for PUI" and A, where A is a formula from Form(.2’), for 
{A} on both sides of a sequent. Moreover, we write = A instead of J > A. 

A sequent I’ => A states that the logical consequence relation that is 
defined in Section 3 holds between I and A. If a sequent = A can be 
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proved by means of the rules of inference given below, then that logical 
consequence relation holds between I’ and A. 


Rules of Inference 


The sequent calculus proof system for BD°*(2’) consists of the inference 
rules given in Table 2. In this table, x and y are meta-variables ranging over 
all variables from Var, t, t;, and tg are meta-variables ranging over all terms 
from Jerm(’), A, Ai, and Ag are meta-variables ranging over all formulas 
from Jorm(+’), and I’ and A are meta-variables ranging over all finite sets 
of formulas from Form(.’). 


Derivations and Proofs 


In the sequent calculus proof system for BD°*(2’), a derivation of a sequent 
I => A from a finite set of sequents H is a finite sequence (s1,..., 8p) of 
sequents such that s,, equals [ = A and, for each 7 € {1,...,n}, one of the 
following conditions holds: 


es, EH; 


e s; is the conclusion of an instance of some inference rule from the proof 
system of BD°*(2’) whose premises are among 51,..., Si-1. 


A proof of a sequent I = A is a derivation of = A from the empty set of 
sequents. A sequent I’ = A is said to be provable if there exists a proof of 
rsa. 

Let I and A be sets of formulas from Jorm(+’). Then A is derivable 
from I’, written I’ A, iff there exist finite sets I’ C I and A’ C A such 
that the sequent I’ > A’ is provable. 

The sequent calculus proof system of BD°*(2’) is sound and complete 
with respect to the logical consequence relation F defined in Section 3. 


Theorem 1 Let I and A be sets of formulas from Form(s’). Then + A 
if Te A. 


Proof: See Appendix A of [22]. 
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Table 2: A sequent calculus proof system for BD°* 
a Ga rs3A,A A,I’> A’ 
A, => A,A * rrsal 
F-L |——_— 
F.rs>A 
Aj, Ao, >A [=> A,A, TI = A, Ag 
A-L A-R 
Aj A Ag,P SA [=> A, A, A Ag 
AylT=sA Ao,T=>Aa rT A, Ai, Ao 
V-L V-R 
AyjVA,,r°>A r A, A; V Ag 
r A, Ay Arr =>A A,,I => A, Ae 
D-L D-R 
AyD 4,,rS3A T A, A; D Ag 
ar [c:=tA,rsA on TS A,|[x:=yjA 
+ Yr. APSA : r=>A,Vrc.A f 
= [jce:=yJA,TSA =a TsSA,[a:=t]A 
: az. A, sa — Pr>A,ic.A 
iF 
A => A,-F 
A,rsA r>A,A 
1 -L 1 -R 
AA,PsAa ['>A,-7-A 
Ay, SA 7AAn, TSA T= A,7A,,7A2 
A-L “A-R 
(A, A Ao), SA rT A, 1 A; \ Az) 
43A,,742,0°>3A [> A,7A, [> A,7A, 
V-L V-R 
(A, V Ao), > A T => A,7(A,; V Az) 
Aj, 7Ao,I => A Lr A, Ai r> A, 7 A2 
1D-L 1D-R 
(Ay 2) Ao), rsaAa r A, (Ay =) Ag) 
4 [2 := yA, P= A oar P= A,A7{2 := tA 
— Ve.A,T>A —l Ts A,r. A 
aT a[c:=tA,Prs>A an [=> A,7[x := yJA ‘ 
: dz.A,c> A - [= A,-dc.A 
Ral t=t,rsA Pel [je:=)JA,r> Aa 
a2 . A aii t, = te, [x = to]A, I >A 


} restriction: y is not free in I’, y is not free in A, y is not free in A unless x = y. 
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Extensions of the Proof System 


The languages of CL°* and BD°* are the same. A sound and complete 
sequent calculus proof system of CL°* can be obtained by adding the 
following two inference rules to the sequent calculus proof system of BD>*: 
r=>A,A A,r=A 


We. freA APS ASA 


If we add only the inference rule —-R to the sequent calculus proof 
system of BD°*, then we obtain a sound and complete proof system of the 
paraconsistent (but not paracomplete) logic LPQ°* presented in [21]. If 
we add only the inference rule —-L to the sequent calculus proof system of 
BD>*, then we obtain a sound and complete proof system of the obvious 
first-order version of the paracomplete (but not paraconsistent) propositional 
logic K3°* presented in [20]. 


5 Covering Terms with an Indeterminate Value 


This section goes into a minor variation of BD?*(’), called BD?" (2), that 
can deal with terms with an indeterminate value. Semantically, this is 
handled by restricting the domain of structures to sets that contain a special 
dummy value | and assigning this value to terms with an indeterminate 
value. Thus, | is treated as a genuine value. It differs from other values 
only by conditions imposed on the equality relation of structures (see below). 
The conditions concerned fit in with the intuition that a term with an 
indeterminate value has an unknown value or does not have a value. 

The language of BD?*(’) and the language of BD?" (2) are the same. 
The logical consequence relation of BD? (2) is defined as for BD?*(3’), but 
in terms of structures of BD?" (2). These structures differ slightly from the 
structures of BD°* (2). 


Structures 


A structure A of BD?*() is a pair (U4, 7), where: 


e UA is aset, called the domain of A, such that | €U4, UA\{L} 490, 
and UANY =O; 


e Z“ consists of: 
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— an element c“ € U% for every ¢ € Funco(S); 


n+1 times 


a 
— a function f4 : UA x... x UA 4 UA for every f € Funcn41(Z) 
and n € N; 


— an element p“ € V for every p € Predo(); 


n+1 times 


a 
— a function PA : UA x... x UA = V for every P € Predn41() 
and n € N; 


— a function =“ : UA x UA > V where, for all d,, dz € UA: 


=A (dy, do) Ee D iff dy, dg € uA \ {1} and d, = da; 
=A (dy, do) =n iff dy = lor dy ie 


By the first condition imposed on =“ in the definition of a structure A 
of BD?" (2), [t = tJ“ € D iff [4] A 1. In other words, the truth value of 
the formula t = ¢ is a designated truth value iff the value of the term t is 
determinate. 

By the second condition imposed on =“ in the definition of a structure A 
of BD?*(S), [t1 = ta|* =n iff [t:J4 = 1 or [t2]|4 = L. In other words, the 
truth value of an equation t; = tg is neither true nor false (n) iff the value 
of t; or tg or both is indeterminate. Similar conditions are not imposed on 
the interpretations of the function and predicate symbols from 3’. However, 
because of the first condition imposed on =“, the relevant conditions can 
be expressed in BD?*(2’). 


Logical Consequence 


Let I and A be sets of formulas from Jorm(2’). Then A is a logical 
consequence of I’, written  F, A, iff for all structures A of BD?" (2), for 
all assignments a in A, if [AJ* € D for all A € I, then [A714 € D for some 
Ae Ns 

By the conditions on =“ in the definition of a structure A of BD?" (2), 
we have that, for all t),t2 € Jerm(): 


ty =t) Ato=toF, ty = fo Vad 19) 5 
ty =to V(t) = te) FL tp =ti Ate =te. 
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Proof System 


A sequent calculus proof system of BD?*(2’) is obtained by replacing the 
inference rule =-Refl by the following two inference rules in the sequent 
calculus proof system of BD°*(2’): 


ty = ty, tg = te, T A 
ti =tVtAt,rSA 


6-=-L 


T A,t) =ty [=> A, to = to 
TSA,tp=toVt, Ft 


6-=-R 


The resulting proof system is sound and complete. It is easy to see that the 
resulting proof system is sound. As explained in [22], the completeness proof 
requires a minor adaptation of the completeness proof for the proof system 
of BD°*(3’) given in that paper. 


Abbreviations 


In what follows, the following abbreviations will be used: 


tL stands for A(t = t) (term determinacy), 
ty == tg stands for ty) = te V 7(til V tel) (strong equality). 


It follows from these definitions that: 


(ely _ { t if (A euA\ {1} 


f otherwise, 


f==—1° =." if [fiJy = L and [t2JQ = 1 
po to) =A (tJ, [f2]4) otherwise. 


Notes 


BD?*(2’) looks like a free logic [17, 23] that is paraconsistent and para- 
complete. However, | is included in the range of variables in BD?*(2), 
whereas | would be excluded from the range of variables in a free logic. This 
difference reflects that is treated as a genuine value in BD?*(2’) and L is 
not treated as a genuine value in free logics. 

BD?*(2) has been devised to deal with logical theories (i.e. sets of 
formulas) of the kind that arise for example when possibly inconsistent 
relational databases with possibly null values are viewed as logical theories. 
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6 Relational Databases Viewed through BD?" 


In this section, relational databases are considered from the perspective of 
BD?", taking into account that a database may be an inconsistent database 
(cf. [6, 2]) and/or a database with null values (cf. [830, 31, 14]). The proof 
theoretic point of view is taken, i.e. a relational database is viewed as a 
logical theory. In the definition of the notion of a relational database, use is 
made of the notions of a relational language and a relational theory. The 
latter two notions are defined first. The definitions given in this section are 
to a great extent based on those given in [26]. However, types are ignored 
for the sake of simplicity (cf. [11, 29]). The view on null values in relational 
databases taken here is informally described in Section 1. 


Relational Languages 


The pair (2’, Jorm(’)), where 5’ is a signature, is called the language of 
BD?" (2). If © satisfies particular conditions, then the language of BD?* (2) 
is considered a relational language. 

Let ©’ be asignature. Then the language R = (2, Form(2)) of BD?" () 
is a relational language iff it satisfies the following conditions: 


e Funco(’) is finite, nil € Funco(2’), and Funco(2’) \ {nil} is non-empty; 
e U{Funcn+41(2’) | n € N} is empty; 
e Predo(2’) is empty; 


e Uf{Predn+1(2’) | n € N} is finite. 


Relational Theories 


Below, we will introduce the notion of a relational theory. In the definition 
of a relational theory, use is made of a number of auxiliary notions. These 
auxiliary notions are defined first. 

Let R = (',Form(’)) be a relational language. Then an atomic 
fact for R is a formula from Jorm(.’) of the form P(ci,...,¢n41), where 
P € Prednii(’) and cy,...,€n41 € Funco(%’). 

Let R = (2, Form(2’)) be a relational language. Then the nil-indeter- 
minacy aziom for R is the formula 


—(nilf) 
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and the equality semi-normality axiom for R is the formula 
Va,0 eo(f2@=a2')A((2LAz'l) ox(2=2')). 


Let R = (2, Form(2’)) be a relational language and let c),...,¢, be all 
members of Funco(5’). Then the domain closure axiom for R is the formula 


Nes (2=— Gf Via VES GG) 
and the unique name axiom set for R is the set of formulas 
(te ==<c,) |Lat<3 a7} 


Let R = (2', Form(2’)) be a relational language and let P € Predy+1( 2’) 
(n EN). Then the P-determinacy axiom for R is the formula 


VE4y 00 >Unt+1 exP(x1,. ee act.) ‘ 


Let R = (X’, Form(’)) be a relational language, let A C Form(2’) be a 
finite set of atomic facts for R, and let P € Predy+1(2’) (n € N). Suppose 
that there exist formulas in A in which P occurs and let P(ct,..., oom S 

.., P(c?,...,c74.4) be all formulas from A in which P occurs. Then the 
P-completion axiom for A is the formula 


Vii Eye Pie ss Ppa) SS 
1 


SO) Ah ea Ve VY pa As a 
Suppose that there does not exist a formula in A in which P occurs. Then 
the P-completion axiom for A is the formula 


Vaiss ..4 ting @ Pipe. tei) + F 


The domain closure, unique name, and P-completion axioms are adopted 
from [26]. The nil-indeterminacy, equality semi-normality, and P-determinacy 
axioms are new. The nil-indeterminacy axiom states that the value of nil 
is indeterminate. The equality semi-normality axiom states that equations 
are interpreted classically except that their truth value is neither true nor 
false if terms with an indeterminate value are involved. The P-determinacy 
axiom states that P never yields the truth value neither true nor false. 

Let R = (5’, Form(2)) be a relational language. Then the relational 
structure axioms for R, written RSA(R), is the set of all formulas A € 
Form(.’) for which one of the following holds: 
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A is the nil-indeterminacy axiom for R; 

e A is the equality semi-normality axiom for R; 

e A is the domain closure axiom for R; 

e A is an element of the unique name axiom set for R; 


e Ais the P-determinacy axiom for R 
for some P € U{Predn+41(2’) | n € N}. 


Let R = (2, Form(2’)) be a relational language, and let A C Form(2’) 
be a finite set of atomic facts for R. Then the relational theory for R with 
basis A, written RT(R, A), is the set of all formulas A € Form(.’) for which 
one of the following holds: 


e Ac RSA(R); 
e Ac A; 


e Ais the P-completion axiom for A 
for some P € U{Predn+41(2’) | n € N}. 


A set 9 C Form(.’) is called a relational theory for R if 09 = RT(R, A) for 
some finite set A C Form(2’) of atomic facts for R. The elements of this 
unique A are called the atomic facts of O. 

The following theorem is a decidability result concerning provability of 
sequents [’ = A where I includes the relational structure axioms for some 
relational language. 


Theorem 2 Let R = (2', Form(2)) be a relational language, and let I’ be 
a finite subset of Form(2’) such that RSA(R) C I. Then it is decidable 
whether, for a formula A € Form(s’), [ => A is provable. 


Proof: Since it is known from Theorem 1 that = A is provable iff f — A, 
it is shown instead that it is decidable whether, for a formula A € Form(~), 
TEA. 

Because RSA(R) C I, it is sufficient to consider only structures that 
are models of RSA(R). The domains of these structures have the same finite 
cardinality. Because in addition there are finitely many predicate symbols 
in 4’, there exist moreover only finitely many of these structures. 

Clearly, it is sufficient to consider only the restrictions of assignments 
to the set of all variables occurring in [’U {A}. Because the set of all 
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variable occurring in I’ U {A} is finite and the domain of the structures to 
be considered is finite, there exist only finitely many such restrictions and 
those restrictions are finite. 

It follows easily from the above-mentioned finiteness properties and 
Proposition 1 that it is decidable whether, for a formula A € Jorm(2), 
TEA. 


Relational Databases 


Having defined the notions of an relational language and a relational theory, 
we are ready to define the notion of a relational database in the setting of 
BD?" 

A relational database DB is a triple (R, 0, =), where: 


e R=(, Form(~)) is a relational language; 
e O is a relational theory for R; 
e = is a finite subset of Jorm(2’). 


@ is called the relational theory of DB and © is called the set of integrity 
constraints of DB. 

The set © of integrity constraints of a relational database DB = 
(R, O,=) can be seen as a set of assumptions about the relational the- 
ory of the relational database Q. If the relational theory agrees with these 
assumptions, then the relational database is called consistent. 

Let R = (57, Form(’)) be a relational language, and let DB = (R, 0, =) 
be a relational database. Then DB is consistent iff, for each A € Form(’) 
such that A is an atomic fact for R or A is of the form =A’ where A’ is an 
atomic fact for R: 


0 = A is provable only if 90,5 => oA is provable. 


Notice that, if DB is not consistent, 09,2 = A’ is provable with the 
sequent calculus proof system of CL°*(2’) for all A’ € Form(2’). However, 
the sequent calculus proof system of BD?*(2’) rules out such an explosion. 


Models of Relational Theories 


The models of relational theories for a relational language R = (4', Form(2’)) 
are structures of BD?" (27) of a special kind. 
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Let R = (2',Form()) be a relational language. Then a relational 
structure for R is a structure A of BD?*(2) such that: 


e nil4 = 1; 
e for all dj, dz € UA: 


=A(d1, dz) # b; 
if dj A L and d; # 1, then =4(d,,d2) An; 


for alld € UA: 


if d # L, then there exists a c € Funco(2’) such that =“(d,c*) = t; 
if d = 1, then there exists a c € Funco(2’) such that =“(d,c“) = n; 


for all cy, cg € Funco(2’): 


if) An and: ="(@"o") SF wand =“(qj4,e07) St 
or 
=A(e4 a") = n and ="(e*, 6") =n, 
then cy = ©9; 


e for alln EN, for all P € Predy41(Z), for all dy,...,dn41 €U*: 
PP dis: napa) - n. 


The following corollary of the definitions of relational structure axioms for R 
and relational structure for R justifies the term “relational structure axioms”. 


Corollary 1 Let R= (2',Form(’)) be a relational language, and let A be 
a structure of BD?"(2’). Then A is a relational structure for R iff, for all 
assignments a in A, for all A€ RSA(R), [AJ* €D. 


Let R = (2, Form(2’)) be a relational language, and let O be a relational 
theory for R. Then all models of O are relational structures for R because 
RSA(R) C O. O does not have a unique model up to isomorphism. 0’s 
predicate completion axioms fail to enforce a unique model up to isomorphism. 
However, identification of t and b in the models of O yields uniqueness up 
to isomorphism. 

Let R = (2, Form(»)) be a relational language, and let A be a relational 
structure for R. Then we write VA for the relational structure A’ for R 
such that: 
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< Ue’ = UA. 


/ 


e for each c € Funco(Z), c® = cA; 


e for each n EN, for each P € Pred,+41(5), for each dj,...,dn41 € UA’, 


re ft PPA digsscytdada) ]4tb} 
NE sgn) = { f otherwise; 


e for each dy, dy €U*’, =4' (di, dz) = =“(di, de). 


Theorem 3 Let R = (27, 7orm(.’)) be a relational language, let O be a 
relational theory for R, and let A and A’ be models of 9. Then VA and VA! 


are isomorphic relational structures. 


Proof: The proof goes in almost the same way as the proof of part 1 of 
Theorem 3.1 from [26]. The only point of attention is that it may be the case 
that, for some P € Predy+1(2) and c1,...,Cn41 € Funco() (n € N), either 
[P(c1,.--¢n41)|* =t and [P(c1,...,¢n41))* =b or [P(c1,.-., ene)“ =b 
and [P(c1,..., eal = t. But, if this is the case, [P(c1,..., Gaal =t 


and [P(cj;+.. Qua =t. 


Qa 


Theorem 4 Let R= (2', Form()) be a relational language, and let A be 
a relational structure for R. Then there exists a relational theory O for R 
such that A is a model of O. 


Proof: The proof goes in the same way as the proof of part 2 of Theorem 3.1 
from [26]. 


7 Query Answering Viewed through BD?* 


In this section, queries applicable to a relational database and their answers 
are considered from the perspective of BD?*. As a matter of fact, the queries 
introduced below are closely related to the relational-calculus-oriented queries 
originally introduced by [9]. Three kinds of answers to queries are introduced. 
Two of them take the integrity constraint of the database into account. They 
differ in their approach to deal with inconsistencies. A discussion of these 
approaches can be found in [21, Section 8}. 
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Queries 


As to be expected in the current setting, a query applicable to a relational 
database involves a formula of BD?*. 
Let R = (X’, Form(»’)) be a relational language. Then a query for R is 


an expression of the form (x1,...,2n) « A, where: 
@ 11,.--,%n € Var; 
e Ace Form() and all variables that are free in A are among 71,...,%n. 


Let DB = (R, 90, =) be a relational database. Then a query is applicable 
to DB iff it is a query for R. 


Answers 


Answering a query with respect to a consistent relational database amounts 
to looking for closed instances of the formula concerned that are logical con- 
sequences of a relational theory. The main issue concerning query answering 
is how to deal with inconsistent relational databases. 

Let R = (2, Form()) be a relational language, let DB = (R, 0, =) be 
a relational database, and let (21,...,2n)+«A be a query that is applicable to 
DB. Then an answer to (%1,...,%p)+«A with respect to DB isa (c1,...,C€n) € 
Funco(2’)” for which 0 => [x1 := cj]... [@n := cylA is provable. 

The above definition of an answer to a query with respect to a database 
does not take into account the integrity constraints of the database concerned. 


Consistent Answers 


The definition of a consistent answer given below is based on the following: 


e the formula that corresponds to an answer is a logical consequence of 
some set of atomic facts and negations of atomic facts that are logical 
consequences of the relational theory of the database; 


e in the case of a consistent answer there must be such a set that does 
not contain an atomic fact or negation of an atomic fact that causes 
the database to be inconsistent. 


Let R = (5’, Form(2’)) be a relational language. Then a semi-atomic 
fact for R is an A € Form() such that A is an atomic fact for R or A is of 
the form =A’ where A’ is an atomic fact for R. 
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Let R = (2, Aorm(~)) be a relational language, let DB = (R, 0, =) be 
a relational database, and let (21,...,2,) «A be a query that is applicable 
to DB. Then a consistent answer to (#1,...,%n) «A with respect to DB isa 
(c1,---,Cn) € Funco(’)” for which there exists a set ® of semi-atomic facts 
for R such that: 


e for all A’ € 6, O = A’ is provable and 90, 5 = oA’ is provable; 
e $, RSA(R) => [21 := 1]... [2n := cplA is provable. 


The above definition of a consistent answer to a query with respect to 
a database is reminiscent of the definition of a consistent answer to a query 
with respect to a database given in [6]. It simply accepts that a database 
is inconsistent and excludes the source or sources of the inconsistency from 
being used in consistent query answering. 


Strongly Consistent Answers 


The definition of a strongly consistent answer given below is not so tolerant 
of inconsistency and makes use of consistent repairs of the database. The 
idea is that an answer is strongly consistent if it is an answer with respect 
to every minimally repaired version of the original database. 

Let R = (2, Form(2’)) be a relational language, and let A C Form(2’) 
be a finite set of atomic facts for R. Then, following [2], the binary relation 
<y, on the set of all finite sets of atomic facts for R is defined by: 


A! <q A" iff (A\ AU (A'\ A) C (A\ A") U(A"\ A). 


Intuitively, A’ <, A” indicates that the extent to which A’ differs from A is 
less than the extent to which A” differs from A. 

Let R = (X’, Form(2’)) be a relational language, let A C Form(2’) be a 
finite set of atomic facts for R, and let 5 is a finite subset of Jorm(2’). Then A 
is consistent with & iff for all semi-atomic facts A for R, RT(R, A) > Ais 
provable only if RT(R, A), 5 = oA is provable. We write Con(=) for the 
set of all finite sets of atomic facts for R that are consistent with =. 

Let R = (X’, Form(’)) be a relational language, let A C Form(2’) be a 
finite set of atomic facts for R, let DB = (R, RT(R, A), =) be a relational 
database, and let (21,...,%,) « A be a query that is applicable to DB. 
Then a strongly consistent answer to (%1,...,%n) «A with respect to DB 
is a (C1,---,Cn) € Funco(2’)” such that, for each A’ that is <,-minimal in 
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Con(£), RT(R, A’) => [x1 := C1]... [@n := Cp]A is provable. The elements of 
Con(=) that are <,4-minimal in Con(&) are called the repairs of A. 

The above definition of a strongly consistent answer to a query with 
respect to a database is essentially the same as the definition of a consistent 
answer to a query with respect to a database given in [2]. It represents, 
presumably, the first view on what the repairs of an inconsistent database 
are. Other views on what the repairs of an inconsistent database are have 
been taken in e.g. [4, 7, 13, 19, 28]. 


Null Values in Answers 


If (c1,...,€n) is an answer, consistent answer or strongly consistent answer 
to a query with respect to a database with null values, then each of ci, ..., 
Cn May be syntactically equal to nil. Such answers are called answers with 
null values. In most work on query answering with respect to a database with 
null values, answers with null values are not considered. Notable exceptions 
are (14, 18], where the view taken on null values in relational databases is 
different from the view taken in this paper. 


Decidability 


The following theorem is a decidability result concerning being an answer to 
a query with respect to a database. 


Theorem 5 Let R = (2',Form(2)) be a relational language, let DB = 
(R, 0, ©) be a relational database, and let (11,...,%n)+A be a query applicable 
to DB. Then it is decidable whether, for (c1,...,€n) € Funco(2’)”: 


e (C1,.--,€n) 1s an answer to (%1,...,%) «A with respect to DB; 
@ (C1,.--,Cn) is a consistent answer to (%1,...,%p)eA with respect to DB; 
e (c1,.--,€n) is a strongly consistent answer to (#1,...,%n) «A with 


respect to DB. 


Proof: Each of these decidability results follows immediately from 
Theorem 2 and the definition of the kind of answer concerned. 


As a corollary of Theorem 5, we have that the set of answers to a query, 
the set of consistent answers to a query, and the set of strongly consistent 
answers to a query are computable. 
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Example 


The example given below is kept extremely simple so that readers that are 
not initiated in the sequent calculus proof system of BD?* can understand 
the remarks made about the provability of sequents. 

Consider the relational database whose relational language, say R, has 
constant symbols a, b, c, d, and nil and ternary predicate symbol P, whose 
relational theory is the relational theory of which P(a,b, nil), P(a,nil,c), 
P(a,nil,d), and P(b,c,d) are the atomic facts, and whose single integrity 
constraint is Vz, y,z,y',2' « (P(z,y,z) A P(z,y',2')) > y = y’.. Moreover, 
consider the query xe dy, z, 2’ + P(z,y,z) A P(a,y, 2!) Az 4 2’. Clearly, the 
set of answers is {a}. 

The sets of semi-atomic formulas that are logical consequences of the 
relational theory and do not cause the database to be inconsistent include 
only one of P(a, 6, nil), P(a, nil, c), and P(a, nil,d). This means that, for each 
such set, say ®, there is no constant symbol k € {a,b,c,d,nil} such that 
&, RSA(R) = [y := kl (Aa, z, 2’ « P(z,y,z) A P(az,y, 2') Az # 2’) is provable. 
Hence, the set of consistent answers is 9. 

The repairs of { P(a, b, nil), P(a, nil, c), P(a, nil, d), P(b, c, d)} include only 
one of P(a,b,nil), P(a,nil,c), and P(a,nil,d). This means that, for each 
repair, say A, there is no constant symbol k € {a,b,c,d,nil} such that 
RT(R, A) => ly := k] (Aza, z, 2’ - P(a,y, z) A P(x, y, 2’) Az & 2’) is provable. 
Hence, the set of strongly consistent answers is also 0. 

Now, consider the relational database that differs from the one described 
above only in that in the integrity constraint y = y’ is replaced by y == y/. 
It is easy to see that in this case the set of answers and the set of strongly 
consistent answers to the query described above remain the same, but the 
set of consistent answers becomes {a}. 


Several examples in which null values play no role can be found in [21]. 


8 The Views on Null Values in Databases 


In this short section, some remarks are made on the different views that 
exist on null values in relational databases. 


Our View 


The view taken in this paper on null values in relational databases can be 
summarized as follows: 
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e in relational databases with null values, a single dummy value, called 
the null value, is used for values that are indeterminate; 


e a value that is indeterminate is a value that is either unknown or 
nonexistent; 


e independent of whether it is unknown or nonexistent, no meaningful 
answer can be given to the question whether the null value and whatever 
value, including the null value itself, are the same. 


In the database literature, a null value for all values that are indeterminate, 
i.e. for all values that are either unknown or nonexistent, is usually called 
a no information null value. Moreover, the term inapplicable is often used 
instead of nonexistent. 


Other Views 


Firstly, the view taken in this paper means that there are no separate null 
values for values that are unknown and values that are nonexistent. Separate 
null values for values that are unknown and values that are nonexistent have 
for example been studied in [12, 30]. In those papers, query answering is 
based on a four-valued logic devised to deal with the two different null values. 
The logics concerned differ from each other and both differ a lot from BD?*. 

Secondly, the view taken in this paper means that the values for which 
the null value is a dummy value is not restricted to values that are unknown. 
A single null value, for values that are unknown only, has for example been 
studied in [5, 10, 14]. In [14] and later publications in which this kind of 
null value is considered, query answering with respect to a database with 
null values is usually approached in a way that is based on the idea that 
a database with null values represents a set of possible databases without 
null values. This approach is reminiscent of the database repair approach to 
query answering with respect to an inconsistent database from [2]. 

Thirdly, the view taken in this paper means that there are no multiple 
null values. Multiple null values, each for values that are unknown only, has 
for example been studied in [14, 18]. In those papers, the same approach to 
query answering is applied as in [14] to the case of a single null value. 

A single null value for both values that are unknown and values that are 
nonexistent has also been studied before in [16, 31]. In [31], query answering 
is based on a three-valued logic that is closely related to BD?*. In [16], an 
attempt is made to apply the approach to query answering from [14] to the 
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case of a single null value for both values that are unknown and values that 
are nonexistent. 

The fact that so many views on null values in relational databases have 
been studied indicates that time and again the views already studied turned 
out to be unsatisfactory from one angle or another. The view taken in this 
paper is primarily based on the consideration that it should be relatively 
simple from a logical perspective, taking into account that a database is 
anyhow no more than a representation of an approximation of a piece 
of reality. 


The Equality Issue 


The view taken in this paper is essentially the same as the view taken in [31] 
and is based on similar considerations. There is one aspect of this view that 
is often considered undesirable in work on null values in relational databases, 
namely the point that no meaningful answer can be given to the question 
whether the null value equals itself. Made precise in the setting of this paper, 
it is the following that is often considered undesirable: 


[é: =te]* =n if [ti[* = [nil] and [t2]4 = [nil] . 


The main reason that is given to consider this undesirable is that it may 
lead to counterintuitive answers to queries. However, this is only so in the 
case where it is (implicitly) assumed that the null value is a dummy value 
for values that are unknown only. In that case, the following is considered 
more appropriate: 


[t: =toJ4=t if [Jo = [nilJ4 and [t.]4 = [ni] 


(see e.g. [18]). 
It is worth mentioning that the adapted equality predicate can be 
expressed in the setting of this paper. Consider the abbreviation t; == tg 


introduced earlier. As mentioned before, it follows from the definition of 
this abbreviation that: 


eee ee ae if [iJ = Land [t2}Q = 1 
bo to) =A (tA, [te]4) otherwise. 


Clearly, == corresponds to the adapted equality predicate. This means that, 
although the definitions in Sections 6 and 7 are based on the view that the 


188 C.A. Middelburg 


null value is a dummy value for both values that are unknown and values 
that are nonexistent, BD?* allows for treating the null value as a dummy 
value for values that are unknown only. 


9 Concluding Remarks 


In this paper, a coherent logical view on relational databases and query 
answering is presented that takes into account the possibility that a database 
is a database with null values and the possibility that a database is an 
inconsistent database. The presented view combines: 


e the proof-theoretic logical view of Reiter [26] on what is a relational 
database, a query applicable to a relational database, and an answer 
to a query with respect to a consistent relational database without 
null values; 


e the view of Zaniolo [31] on null values in relational databases; 


e the view of Bry [6] as well as the view of Arenas et al [2] on what is a 
consistent answer to a query with respect to an inconsistent relational 
database. 


The view is expressed in the setting of the paraconsistent and paracomplete 
logic BD?*. In a logic that is not paracomplete, it would have been difficult 
to take properly into account the possibility that a database is a database 
with null values, and in a logic that is not paraconsistent, it would have been 
difficult to take properly into account the possibility that a database is an 
inconsistent database. 

The main views on what is a consistent answer to a query with respect 
to an inconsistent relational database are the view of Bry [6] and the view of 
Arenas et al [2]. The latter view is based on a notion of a minimal repair of 
an inconsistent database. Most other views on consistent query answering, 
for example the views presented in [4, 7, 13, 19, 28], are based on a notion 
of a minimal repair of an inconsistent database that differs from the one 
from [2] by different choices concerning, among other things, the kinds of 
changes (deletions, additions, alterations) that may be made to the original 
database to obtain a repair and what is taken as the extent to which two 
databases differ. 

There are four main views on null values in relational databases, namely 
the view of Zaniolo [31], two views of Imielitiski and Lipski [14], and the 
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view of Vassilion [30]. Other views are mostly variations on one of these 
views. The Codd-table view from [14], in which there are no multiple null 
values, can be expressed in the setting of BD?" because the single null value 
can be treated as in that view by using == instead of =. In the V-table 
view from [14], there may be multiple null values (called marked nulls). A 
variation of BD?" is needed to express this view. In the view from [30], 
there are separate null values for values that are unknown and values that 
are nonexistent. It is questionable whether BD?” is suitable to express this 
view on null values in relational databases, in particular if the possibility 
that a databases is inconsistent has to be taken into account as well. 

It is shown in this paper that, for each of the three kinds of answer 
introduced, being an answer to a query with respect to a relational database 
is decidable. However, it is to be expected that query answering for these 
kinds of answers is intractable without severe restrictions on the integrity 
constraints and/or the use of null values. A study of the computational 
complexity of query answering for these kinds of answers in the setting of 
BD?" is an interesting option for future work. 
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